EXPLORATORY ANALYSIS ON NIGERIA'S WORLD BANK DATASET¶

The objective of this analysis is to practise data analysis with the Python programming language on the World Bank's data on Nigeria. This will involve:

  • Data sourcing
  • Data Pre-processing and cleaning
  • Exploratory Data Analysis
  • Data Visualization
  • Data interpretation

About Nigeria¶

Nigeria, an African country on the Gulf of Guinea, has many natural landmarks and wildlife reserves. Protected areas such as Cross River National Park and Yankari National Park have waterfalls, dense rainforest, savanna and rare primate habitats. One of the most recognizable sites is Zuma Rock, a 725m-tall monolith outside the capital, Abuja, that’s pictured on the national currency.

Some Insights¶

  • Nigeria's GDP growth rate (annual %) is still trailing its historical average
  • Inflation is on the increase despite the slow growth in GDP
  • Net migration is still negative, as more people seem to prefer to seek "greener pastures" outside the country
  • The contribution of trade to GDP is experiencing some recovery

Import Necessary Libraries¶

In [4]:
import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt
import seaborn as sns 
import plotly.express as px 
In [5]:
# Load the dataset. Source: World Bank website (https://data.worldbank.org/country/NG)
df = pd.read_csv('API_NGA_DS2_en_csv_v2_5455596.csv')

View the Dataset¶

In [6]:
df.head()
Out[6]:
Data Source World Development Indicators Unnamed: 2 Unnamed: 3 Unnamed: 4 Unnamed: 5 Unnamed: 6 Unnamed: 7 Unnamed: 8 Unnamed: 9 ... Unnamed: 57 Unnamed: 58 Unnamed: 59 Unnamed: 60 Unnamed: 61 Unnamed: 62 Unnamed: 63 Unnamed: 64 Unnamed: 65 Unnamed: 66
0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 Last Updated Date 5/10/2023 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 Country Name Country Code Indicator Name Indicator Code 1960.0 1961.0 1962.0 1963.0 1964.0 1965.0 ... 2013.0 2014.0 2015.0 2016.000000 2017.0 2018.0 2019.000000 2020.0 2021.0 2022.0
4 Nigeria NGA Intentional homicides (per 100,000 people) VC.IHR.PSRC.P5 NaN NaN NaN NaN NaN NaN ... NaN NaN NaN 33.604193 NaN NaN 21.740789 NaN NaN NaN

5 rows × 67 columns

In [42]:
# Let's make a copy of the original dataset
df_copy = df.copy()

Data Preprocessing¶

In [43]:
# Let's drop the first 3 rows (file metadata) from the copy
df_copy = df_copy.drop([0, 1, 2], axis=0)
In [44]:
# Check the result after dropping first 3 rows 
df_copy.head()
Out[44]:
Data Source World Development Indicators Unnamed: 2 Unnamed: 3 Unnamed: 4 Unnamed: 5 Unnamed: 6 Unnamed: 7 Unnamed: 8 Unnamed: 9 ... Unnamed: 57 Unnamed: 58 Unnamed: 59 Unnamed: 60 Unnamed: 61 Unnamed: 62 Unnamed: 63 Unnamed: 64 Unnamed: 65 Unnamed: 66
3 Country Name Country Code Indicator Name Indicator Code 1960.0 1961.0 1962.0 1963.0 1964.0 1965.0 ... 2013.000000 2014.000000 2015.000000 2016.000000 2017.000000 2018.000000 2019.000000 2020.000000 2021.000000 2022.0
4 Nigeria NGA Intentional homicides (per 100,000 people) VC.IHR.PSRC.P5 NaN NaN NaN NaN NaN NaN ... NaN NaN NaN 33.604193 NaN NaN 21.740789 NaN NaN NaN
5 Nigeria NGA Internally displaced persons, new displacement... VC.IDP.NWDS NaN NaN NaN NaN NaN NaN ... 117000.000000 3000.000000 100000.000000 78000.000000 122000.000000 613000.000000 157000.000000 279000.000000 24000.000000 NaN
6 Nigeria NGA Voice and Accountability: Percentile Rank, Upp... VA.PER.RNK.UPPER NaN NaN NaN NaN NaN NaN ... 30.516432 34.482758 40.886700 41.871922 41.379311 37.198067 37.681160 34.782608 34.299519 NaN
7 Nigeria NGA Voice and Accountability: Estimate VA.EST NaN NaN NaN NaN NaN NaN ... -0.693028 -0.587156 -0.372614 -0.319363 -0.339919 -0.430503 -0.434371 -0.580638 -0.636556 NaN

5 rows × 67 columns

We can observe from the above table that we don't have the desired column headers. The preferred header titles can be found in the row with index 3, so we need to convert that row into the column header.
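As an aside, the metadata rows can sometimes be skipped at load time instead of being dropped afterwards. The sketch below is only an illustration on a small in-memory stand-in for the file (the `raw` text and the toy indicator names are made up), and it assumes the real file has exactly four lines before the header row, as the preview above suggests.

```python
import io
import pandas as pd

# Toy stand-in for the World Bank file: four metadata lines, then the real header row.
raw = (
    "Data Source,World Development Indicators,,\n"
    ",,,\n"
    "Last Updated Date,5/10/2023,,\n"
    ",,,\n"
    "Country Name,Indicator Name,1960,1961\n"
    "Nigeria,Toy indicator A,1.0,2.0\n"
    "Nigeria,Toy indicator B,,3.5\n"
)

# skiprows=4 skips the four metadata lines, so the fifth line is used as the header.
toy = pd.read_csv(io.StringIO(raw), skiprows=4)
print(toy.columns.tolist())
print(toy.shape)
```

If the number of leading lines differs in another download, the `skiprows` value would need to be adjusted accordingly.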

In [45]:
# Convert row 3 to column headers
df_copy.columns=df_copy.iloc[0]
In [46]:
#  Checking result 
df_copy.head(2)
Out[46]:
3 Country Name Country Code Indicator Name Indicator Code 1960.0 1961.0 1962.0 1963.0 1964.0 1965.0 ... 2013.0 2014.0 2015.0 2016.0 2017.0 2018.0 2019.0 2020.0 2021.0 2022.0
3 Country Name Country Code Indicator Name Indicator Code 1960.0 1961.0 1962.0 1963.0 1964.0 1965.0 ... 2013.0 2014.0 2015.0 2016.000000 2017.0 2018.0 2019.000000 2020.0 2021.0 2022.0
4 Nigeria NGA Intentional homicides (per 100,000 people) VC.IHR.PSRC.P5 NaN NaN NaN NaN NaN NaN ... NaN NaN NaN 33.604193 NaN NaN 21.740789 NaN NaN NaN

2 rows × 67 columns

In [47]:
# Drop redundant columns and save the result back to df_copy
df_copy= df_copy.drop(['Country Name','Country Code','Indicator Code'], axis=1)
In [48]:
# Check the  result 
df_copy.head(2)
Out[48]:
3 Indicator Name 1960.0 1961.0 1962.0 1963.0 1964.0 1965.0 1966.0 1967.0 1968.0 ... 2013.0 2014.0 2015.0 2016.0 2017.0 2018.0 2019.0 2020.0 2021.0 2022.0
3 Indicator Name 1960.0 1961.0 1962.0 1963.0 1964.0 1965.0 1966.0 1967.0 1968.0 ... 2013.0 2014.0 2015.0 2016.000000 2017.0 2018.0 2019.000000 2020.0 2021.0 2022.0
4 Intentional homicides (per 100,000 people) NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN 33.604193 NaN NaN 21.740789 NaN NaN NaN

2 rows × 64 columns

In [50]:
# Transpose so that the Indicator Name rows become columns (preview only; the result is not stored)
df_copy.set_index('Indicator Name').T.head(2).copy()
Out[50]:
Indicator Name Indicator Name Intentional homicides (per 100,000 people) Internally displaced persons, new displacement associated with disasters (number of cases) Voice and Accountability: Percentile Rank, Upper Bound of 90% Confidence Interval Voice and Accountability: Estimate High-technology exports (current US$) Merchandise exports to low- and middle-income economies within region (% of total merchandise exports) Merchandise exports to low- and middle-income economies in South Asia (% of total merchandise exports) Merchandise exports to low- and middle-income economies in East Asia & Pacific (% of total merchandise exports) Merchandise exports to economies in the Arab World (% of total merchandise exports) ... School enrollment, primary, female (% gross) Primary education, pupils Educational attainment, at least completed primary, population 25+ years, female (%) (cumulative) Primary school starting age (years) School enrollment, preprimary, male (% gross) Preprimary education, duration (years) School enrollment, primary (gross), gender parity index (GPI) Literacy rate, adult female (% of females ages 15 and above) Literacy rate, youth female (% of females ages 15-24) Regulatory Quality: Percentile Rank
3
1960.0 1960.0 NaN NaN NaN NaN NaN 0.692941 0.303162 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1961.0 1961.0 NaN NaN NaN NaN NaN 0.864375 0.349866 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

2 rows × 1479 columns

In [51]:
# Drop the row with index 3 from the dataset since it has been used as the column headers
df_copy2=df_copy.drop([3], axis =0).copy()
In [53]:
df_copy2.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1478 entries, 4 to 1481
Data columns (total 64 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Indicator Name  1478 non-null   object 
 1   1960.0          170 non-null    float64
 2   1961.0          195 non-null    float64
 3   1962.0          215 non-null    float64
 4   1963.0          232 non-null    float64
 5   1964.0          234 non-null    float64
 6   1965.0          239 non-null    float64
 7   1966.0          235 non-null    float64
 8   1967.0          238 non-null    float64
 9   1968.0          243 non-null    float64
 10  1969.0          244 non-null    float64
 11  1970.0          341 non-null    float64
 12  1971.0          357 non-null    float64
 13  1972.0          363 non-null    float64
 14  1973.0          356 non-null    float64
 15  1974.0          340 non-null    float64
 16  1975.0          345 non-null    float64
 17  1976.0          350 non-null    float64
 18  1977.0          422 non-null    float64
 19  1978.0          418 non-null    float64
 20  1979.0          415 non-null    float64
 21  1980.0          400 non-null    float64
 22  1981.0          527 non-null    float64
 23  1982.0          538 non-null    float64
 24  1983.0          550 non-null    float64
 25  1984.0          544 non-null    float64
 26  1985.0          559 non-null    float64
 27  1986.0          560 non-null    float64
 28  1987.0          541 non-null    float64
 29  1988.0          552 non-null    float64
 30  1989.0          569 non-null    float64
 31  1990.0          697 non-null    float64
 32  1991.0          685 non-null    float64
 33  1992.0          702 non-null    float64
 34  1993.0          678 non-null    float64
 35  1994.0          674 non-null    float64
 36  1995.0          707 non-null    float64
 37  1996.0          759 non-null    float64
 38  1997.0          702 non-null    float64
 39  1998.0          737 non-null    float64
 40  1999.0          766 non-null    float64
 41  2000.0          878 non-null    float64
 42  2001.0          796 non-null    float64
 43  2002.0          832 non-null    float64
 44  2003.0          919 non-null    float64
 45  2004.0          846 non-null    float64
 46  2005.0          919 non-null    float64
 47  2006.0          954 non-null    float64
 48  2007.0          940 non-null    float64
 49  2008.0          964 non-null    float64
 50  2009.0          917 non-null    float64
 51  2010.0          1028 non-null   float64
 52  2011.0          992 non-null    float64
 53  2012.0          928 non-null    float64
 54  2013.0          992 non-null    float64
 55  2014.0          989 non-null    float64
 56  2015.0          999 non-null    float64
 57  2016.0          981 non-null    float64
 58  2017.0          920 non-null    float64
 59  2018.0          1027 non-null   float64
 60  2019.0          919 non-null    float64
 61  2020.0          836 non-null    float64
 62  2021.0          656 non-null    float64
 63  2022.0          54 non-null     float64
dtypes: float64(63), object(1)
memory usage: 739.1+ KB

Split Columns¶

Here we will try to use the indicator names as column headers and the years column as row index

In [59]:
# Select only the 'Indicator Name' column
df_indicator_name=df_copy2['Indicator Name']

# Select all date columns, without 'Indicator Name'
df_date_series = df_copy2.iloc[:,1:]
In [63]:
df_indicator_name= df_indicator_name.reindex()
In [64]:
df_indicator_name=pd.DataFrame(df_indicator_name)
In [65]:
# Stack both pieces row-wise with a fresh index; the date rows get NaN in 'Indicator Name'
df_combined=pd.concat([df_indicator_name, df_date_series], ignore_index=True)
In [67]:
df_combined.head(3)
Out[67]:
Indicator Name 1960.0 1961.0 1962.0 1963.0 1964.0 1965.0 1966.0 1967.0 1968.0 ... 2013.0 2014.0 2015.0 2016.0 2017.0 2018.0 2019.0 2020.0 2021.0 2022.0
0 Intentional homicides (per 100,000 people) NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 Internally displaced persons, new displacement... NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 Voice and Accountability: Percentile Rank, Upp... NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

3 rows × 64 columns

In [68]:
# Collect the unique values of the 'Indicator Name' column in a list (the NaN from the stacked date rows is included)
df_list = df_combined['Indicator Name'].drop_duplicates().to_list()
In [69]:
# Let's check the length of the list to ensure it will fit the new dataframe we want to create
len(df_list)
Out[69]:
1479
In [70]:
# Transpose the date columns so that the years become rows, then reset the index
date_indexed =pd.DataFrame(df_date_series.T).reset_index()
In [71]:
date_indexed.head(3)
Out[71]:
3 4 5 6 7 8 9 10 11 12 ... 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481
0 1960.0 NaN NaN NaN NaN NaN 0.692941 0.303162 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 1961.0 NaN NaN NaN NaN NaN 0.864375 0.349866 NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 1962.0 NaN NaN NaN NaN NaN 2.906853 0.084872 NaN 1.31551 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

3 rows × 1479 columns

In [72]:
# Copy the year values (column 3 after the reset) into a new 'year' column at the end
date_indexed['year']=date_indexed[3]
In [73]:
# Now drop the original year column (3) so the dataframe keeps the same number of columns
df_cols=date_indexed.drop([3], axis=1)
In [74]:
df_cols.head(3)
Out[74]:
4 5 6 7 8 9 10 11 12 13 ... 1473 1474 1475 1476 1477 1478 1479 1480 1481 year
0 NaN NaN NaN NaN NaN 0.692941 0.303162 NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 1960.0
1 NaN NaN NaN NaN NaN 0.864375 0.349866 NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 1961.0
2 NaN NaN NaN NaN NaN 2.906853 0.084872 NaN 1.31551 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 1962.0

3 rows × 1479 columns

In [75]:
# Rename the column headers with the indicator names; the list length matches the number of columns
df_cols.columns=df_list
In [76]:
# The last header is NaN (it lines up with the year column); rename it to "Year"
df_cols.columns =df_cols.columns.fillna("Year")
In [77]:
df_cols.head(2)
Out[77]:
Intentional homicides (per 100,000 people) Internally displaced persons, new displacement associated with disasters (number of cases) Voice and Accountability: Percentile Rank, Upper Bound of 90% Confidence Interval Voice and Accountability: Estimate High-technology exports (current US$) Merchandise exports to low- and middle-income economies within region (% of total merchandise exports) Merchandise exports to low- and middle-income economies in South Asia (% of total merchandise exports) Merchandise exports to low- and middle-income economies in East Asia & Pacific (% of total merchandise exports) Merchandise exports to economies in the Arab World (% of total merchandise exports) ICT goods exports (% of total goods exports) ... Primary education, pupils Educational attainment, at least completed primary, population 25+ years, female (%) (cumulative) Primary school starting age (years) School enrollment, preprimary, male (% gross) Preprimary education, duration (years) School enrollment, primary (gross), gender parity index (GPI) Literacy rate, adult female (% of females ages 15 and above) Literacy rate, youth female (% of females ages 15-24) Regulatory Quality: Percentile Rank Year
0 NaN NaN NaN NaN NaN 0.692941 0.303162 NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 1960.0
1 NaN NaN NaN NaN NaN 0.864375 0.349866 NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 1961.0

2 rows × 1479 columns

In [78]:
# Change the year column from float data to integer 
df_cols['Year']=df_cols['Year'].astype('int', copy=True)
In [80]:
df_cols.head(3)
Out[80]:
Intentional homicides (per 100,000 people) Internally displaced persons, new displacement associated with disasters (number of cases) Voice and Accountability: Percentile Rank, Upper Bound of 90% Confidence Interval Voice and Accountability: Estimate High-technology exports (current US$) Merchandise exports to low- and middle-income economies within region (% of total merchandise exports) Merchandise exports to low- and middle-income economies in South Asia (% of total merchandise exports) Merchandise exports to low- and middle-income economies in East Asia & Pacific (% of total merchandise exports) Merchandise exports to economies in the Arab World (% of total merchandise exports) ICT goods exports (% of total goods exports) ... Primary education, pupils Educational attainment, at least completed primary, population 25+ years, female (%) (cumulative) Primary school starting age (years) School enrollment, preprimary, male (% gross) Preprimary education, duration (years) School enrollment, primary (gross), gender parity index (GPI) Literacy rate, adult female (% of females ages 15 and above) Literacy rate, youth female (% of females ages 15-24) Regulatory Quality: Percentile Rank Year
0 NaN NaN NaN NaN NaN 0.692941 0.303162 NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 1960
1 NaN NaN NaN NaN NaN 0.864375 0.349866 NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 1961
2 NaN NaN NaN NaN NaN 2.906853 0.084872 NaN 1.31551 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN 1962

3 rows × 1479 columns

In [82]:
# Save the new dataframe
df_new = df_cols

df_new = df_new.round(1)

# Let's set the Year column as the index
df_new.index = df_new['Year']
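
For comparison, the same wide-to-long reshape can be sketched more compactly with `set_index` followed by a transpose. This is only an illustrative alternative on a made-up two-indicator frame (`wide`, `tidy` and the toy indicator names are assumptions, not part of the dataset), not a replacement for the cells above.

```python
import pandas as pd

# Toy frame shaped like df_copy2: one row per indicator, one column per year.
wide = pd.DataFrame({
    "Indicator Name": ["Toy indicator A", "Toy indicator B"],
    1960.0: [1.0, None],
    1961.0: [2.0, 3.5],
})

# Use the indicator names as the index, then transpose so years become rows.
tidy = wide.set_index("Indicator Name").T

# Convert the float year labels to integers and name the index.
tidy.index = [int(year) for year in tidy.index]
tidy.index.name = "Year"
tidy.columns.name = None

print(tidy)
```

With this layout the year is carried by the index, so no separate NaN header has to be renamed afterwards.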

Checking for Missing Values¶

In [84]:
# Check percentage of missing values and output as a dataframe 
check_missing_values =pd.DataFrame((df_new.isnull().sum()/df_new.shape[0])*100).round(2).sort_values(by= 0, ascending=False)
check_missing_values=check_missing_values.set_axis(['Missing Values%'], axis=1)
check_missing_values.columns.rename('Indicator Name', inplace=True)
check_missing_values.head()
Out[84]:
Indicator Name Missing Values%
Educational attainment, Doctoral or equivalent, population 25+, female (%) (cumulative) 100.0
Survey mean consumption or income per capita, bottom 40% of population (2017 PPP $ per day) 100.0
Customs and other import duties (current LCU) 100.0
Taxes on exports (% of tax revenue) 100.0
Social contributions (% of revenue) 100.0
  • Only Net migration, Merchandise imports, Population in urban agglomerations of more than 1 million (% of total population), Population in largest city, Population in the largest city (% of urban population) and the Year column have complete entries (0% missing values)
  • It can be observed from the above that some economic indicators are missing for more than 50% of the 63 years of data entries. The relative newness of some indicators may be responsible for this
  • We may have to drop some indicators, or select only those with adequate entries, for further analysis. But first, let's drop the indicators with the most missing values; the loop below keeps only columns with at most 5% missing values

Create a Loop to Delete Multiple Columns with Missing Values¶

In [86]:
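# Let's use a tight threshold of 5% missing values to keep only well-populated indicators
# Note: df_dropped_cols refers to the same object as df_new, so the columns are removed from both
df_dropped_cols = df_new
for i in list(df_dropped_cols.columns):
    if (df_dropped_cols[i].isnull().sum()/df_dropped_cols.shape[0])*100 > 5.0:
        del df_dropped_cols[i]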
In [87]:
# Let's check the result using df_dropped_cols
check_missing_values =pd.DataFrame((df_dropped_cols.isnull().sum()/df_dropped_cols.shape[0])*100).round(2).sort_values(by= 0, ascending=False)
check_missing_values=check_missing_values.set_axis(['Missing Values%'], axis=1)
check_missing_values.columns.rename('Indicator Name', inplace=True)
check_missing_values.head()
Out[87]:
Indicator Name Missing Values%
Merchandise exports to low- and middle-income economies within region (% of total merchandise exports) 4.76
Renewable internal freshwater resources, total (billion cubic meters) 4.76
Arable land (hectares) 4.76
Merchandise exports by the reporting economy, residual (% of total merchandise exports) 4.76
Merchandise exports to high-income economies (% of total merchandise exports) 4.76

Here we have now captured only indicators that have less than 5% missing values. We have also succeeded in trimming the size of the dataset to the indicators that can give better information. We can now proceed to carry out our analysis.
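
The same kind of filter can also be sketched without an explicit loop, using a boolean mask over the columns. The example below runs on a small made-up frame; `toy`, `threshold` and the 30% cut-off are illustrative assumptions (the analysis above uses 5%).

```python
import numpy as np
import pandas as pd

# Toy frame with 0%, 25% and 75% missing values per column.
toy = pd.DataFrame({
    "complete": [1.0, 2.0, 3.0, 4.0],
    "one_gap": [1.0, np.nan, 3.0, 4.0],
    "mostly_empty": [np.nan, np.nan, np.nan, 4.0],
})

threshold = 30.0  # hypothetical cut-off, in percent

# isnull().mean() gives the fraction of missing values in each column.
missing_pct = toy.isnull().mean() * 100

# Keep only the columns at or below the threshold; toy itself is left unchanged.
kept = toy.loc[:, missing_pct <= threshold]
print(kept.columns.tolist())
```

Unlike deleting columns in place, this returns a new frame, so the original data stays available for later checks.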

In [88]:
# Let's fill the missing values with zeros
df_filled=df_dropped_cols.fillna(0).copy()
In [89]:
# Check the result after dealing with missing values
df_filled.isnull().sum().head()
Out[89]:
Merchandise exports to low- and middle-income economies within region (% of total merchandise exports)              0
Merchandise imports from low- and middle-income economies in Sub-Saharan Africa (% of total merchandise imports)    0
Merchandise imports (current US$)                                                                                   0
Urban population                                                                                                    0
Rural population                                                                                                    0
dtype: int64

Exploratory Analysis¶

What is the Trend of Net Migration in Nigeria since 1960?¶

In [91]:
plt.figure(figsize=(15,5))
ax=sns.barplot(data=df_filled,  y= df_filled['Net migration'], x=df_filled['Year'])
ax.set_xticklabels(ax.get_xticklabels(), rotation=90, ha="right")
plt.tight_layout()
plt.show()

Nigeria's net migration has been negative. Net migration is the net total of migrants during the period, that is, the number of immigrants minus the number of emigrants, including both citizens and noncitizens. From the above chart we see a very large negative bar in 1984. This may be an outlier; more investigation is required to understand what was responsible for it.

GDP growth (annual %)¶

According to the World Bank's long definition, GDP is the sum of gross value added by all resident producers in the economy plus any product taxes and minus any subsidies not included in the value of the products. It is calculated without making deductions for depreciation of fabricated assets or for depletion and degradation of natural resources.

What is the average GDP growth rate since 1960?¶

In [95]:
df_new['GDP growth (annual %)'].mean().round(2)
Out[95]:
3.68

What is the Highest GDP Growth Rate since 1960?¶

In [96]:
df_new['GDP growth (annual %)'].max()
Out[96]:
25.0
In [97]:
# Find the year in which the highest GDP growth was recorded
df_new['GDP growth (annual %)'].loc[df_new['GDP growth (annual %)']==25]
Out[97]:
Year
1970    25.0
Name: GDP growth (annual %), dtype: float64

What is the Lowest GDP Growth Rate since 1960?¶

In [100]:
df_new['GDP growth (annual %)'].min()
Out[100]:
-15.7
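
When the year of an extreme value is needed as well as the value itself, `idxmax` and `idxmin` return the index label directly, which avoids comparing floats against a hard-coded number. A minimal sketch on a made-up series indexed by year (the values below are toy numbers, not World Bank figures):

```python
import pandas as pd

# Toy growth series indexed by year.
growth = pd.Series(
    [2.0, 7.5, -3.0, 1.0],
    index=pd.Index([2000, 2001, 2002, 2003], name="Year"),
    name="toy growth (annual %)",
)

peak_year = growth.idxmax()    # index label of the largest value
trough_year = growth.idxmin()  # index label of the smallest value

print(peak_year, growth.loc[peak_year])
print(trough_year, growth.loc[trough_year])
```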

What was the last GDP Growth Recorded ?¶

In [102]:
# Find the last recorded value: walk backwards to a zero (a filled-in missing value)
# whose predecessor is non-zero, then print that predecessor
gdp = df_filled['GDP growth (annual %)']

for i in reversed(range(1, len(gdp))):
    if gdp.iloc[i] == 0.0 and gdp.iloc[i - 1] != 0:
        print(df_filled[['GDP growth (annual %)', 'Year']].iloc[i - 1])
GDP growth (annual %)       3.6
Year                     2021.0
Name: 2021, dtype: float64
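
If the missing values have not been replaced with zeros, the last recorded observation can be looked up directly with `last_valid_index`. The sketch below uses a made-up series with a trailing NaN; the name `toy_series` and its values are illustrative only.

```python
import numpy as np
import pandas as pd

# Toy series indexed by year, with no value recorded for the final year.
toy_series = pd.Series(
    [1.0, -2.0, 3.0, np.nan],
    index=pd.Index([2019, 2020, 2021, 2022], name="Year"),
)

last_year = toy_series.last_valid_index()  # label of the last non-missing entry
last_value = toy_series.loc[last_year]

print(last_year, last_value)
```

This approach also distinguishes a genuine growth rate of 0.0 from a missing entry, which a zero-filled column cannot do.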

Observations:¶

  • Nigeria recorded a positive growth rate in 2021 after negative growth in the prior year. According to the World Bank data, Nigeria's GDP growth rate for 2021 was 3.6%, just below the average of 3.68% since 1960. The highest GDP growth, 25.0%, was recorded in 1970.

What is the Trend in Nigeria's Trade (% of GDP) since 1960?¶

In [105]:
plt.figure(figsize=(15,5))
ax=sns.barplot(data=df_filled,  y= df_filled['Trade (% of GDP)'], x=df_filled['Year'])
ax.set_xticklabels(ax.get_xticklabels(), rotation=90, ha="right")
plt.tight_layout()
plt.show()

Is there any correlation between GDP growth and Inflation between 2000 and 2021?¶

In [107]:
# Create a dataframe selecting only 22 years of data (2000 to 2021)

df_10y = df_filled.query("Year >= 2000 and Year <2022")[['Inflation, consumer prices (annual %)', 'GDP growth (annual %)']]
In [108]:
# Check correlations 
df_10y.corr()
Out[108]:
Inflation, consumer prices (annual %) GDP growth (annual %)
Inflation, consumer prices (annual %) 1.000000 -0.108584
GDP growth (annual %) -0.108584 1.000000
  • There seems to be a weak negative correlation (about -0.11) between inflation (consumer prices) and the GDP growth rate in Nigeria over the 22 years from 2000 to 2021. A single coefficient may not tell us much about the relationship between the two indicators; a graph may do a better job.
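
As a rough illustration of what the coefficient measures, the Pearson correlation can be computed by hand and compared with the pandas result. The sketch below uses made-up numbers (`toy`, `x` and `y` are illustrative names, not columns from the dataset).

```python
import numpy as np
import pandas as pd

# Toy data: two short series that tend to rise together.
toy = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [2, 1, 4, 3, 5]})

# Deviations of each series from its own mean.
dx = toy["x"] - toy["x"].mean()
dy = toy["y"] - toy["y"].mean()

# Pearson r: sum of co-deviations divided by the product of the deviation norms.
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# pandas computes the same quantity (Pearson is the default method).
r_pandas = toy["x"].corr(toy["y"])

print(round(r_manual, 3), round(r_pandas, 3))
```

A value near 0, such as the one found above for inflation and GDP growth, indicates little linear association, although it says nothing about non-linear patterns.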
In [110]:
# Plot data 
px.area(data_frame=df_10y)

Inflation has been trending upwards alongside Nigeria's slow GDP growth, as seen from the chart. This could pose a serious concern for the economy.


Conclusions¶

This analysis was mainly intended to showcase data analytics without delving into the nuts and bolts of economic mechanics. It is worth noting, however, that some key economic indicators in the World Bank data suggest that the economy may be lagging behind its true potential, particularly when reviewing the GDP growth rate, net migration, trade and inflation indicators. I want to believe that these areas, if worked on by the new administration, can have a positive impact on the economy at large. Based on the dataset, there is a barrage of other economic factors for the government to consider, which makes data analysis (with the aid of the Python programming language) of great importance.